negative preference
SynPO: Synergizing Descriptiveness and Preference Optimization for Video Detailed Captioning
Dang, Jisheng, Zhang, Yizhou, Ye, Hao, Wang, Teng, Chen, Siming, Zheng, Huicheng, Guo, Yulan, Lai, Jianhuang, Hu, Bin
Fine-grained video captioning aims to generate detailed, temporally coherent descriptions of video content. However, existing methods struggle to capture subtle video dynamics and rich detailed information. In this paper, we leverage preference learning to enhance the performance of vision-language models in fine-grained video captioning, while mitigating several limitations inherent to direct preference optimization (DPO). First, we propose a pipeline for constructing preference pairs that leverages the intrinsic properties of VLMs along with partial assistance from large language models, achieving an optimal balance between cost and data quality. Second, we propose Synergistic Preference Optimization (SynPO), a novel optimization method offering significant advantages over DPO and its variants. SynPO prevents negative preferences from dominating the optimization, explicitly preserves the model's language capability to avoid deviation of the optimization objective, and improves training efficiency by eliminating the need for the reference model. We extensively evaluate SynPO not only on video captioning benchmarks (e.g., VDC, VDD, VATEX) but also across well-established NLP tasks, including general language understanding and preference evaluation, using diverse pretrained models. Results demonstrate that SynPO consistently outperforms DPO variants while achieving a 20% improvement in training efficiency. Code is available at https://github.com/longmalongma/SynPO
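For orientation, the standard DPO objective that the abstract contrasts against can be sketched as below. This is the widely used DPO loss for a single preference pair, not SynPO itself, whose formulation the abstract does not give. The function name, the example log-probabilities, and the value of beta are illustrative assumptions. The example also shows one way a rejected ("negative") response can dominate: the loss falls even though the chosen response becomes less likely, because the rejected response falls faster.

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Standard DPO loss for one preference pair (illustrative sketch).

    Inputs are sequence log-probabilities under the trained policy and
    under the frozen reference model. beta is an assumed temperature.
    """
    # How much more the policy favours the chosen response than the
    # reference does, minus the same quantity for the rejected response.
    margin = ((policy_chosen_logp - ref_chosen_logp)
              - (policy_rejected_logp - ref_rejected_logp))
    z = beta * margin
    # -log(sigmoid(z)) computed in a numerically stable way.
    if z > 0:
        return math.log1p(math.exp(-z))
    return -z + math.log1p(math.exp(z))


# Hypothetical log-probabilities, chosen only to illustrate the effect.
loss_before = dpo_loss(-10.0, -10.0, -10.0, -10.0)
# The chosen response gets LESS likely (-10 -> -12), but the rejected one
# drops much faster (-10 -> -20), so the loss still decreases.
loss_after = dpo_loss(-12.0, -20.0, -10.0, -10.0)
```

Note that the loss requires reference-model log-probabilities for every pair, which is the cost the abstract says SynPO removes.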
Towards Unified Modeling for Positive and Negative Preferences in Sign-Aware Recommendation
Liu, Yuting, Dang, Yizhou, Liang, Yuliang, Liu, Qiang, Guo, Guibing, Zhao, Jianzhe, Wang, Xingwei
Recently, sign-aware graph recommendation has drawn much attention because it learns users' negative preferences in addition to positive ones from both positive and negative interactions (i.e., links in a graph) with items. To accommodate the different semantics of negative and positive links, existing works use two independent encoders to model users' positive and negative preferences, respectively. However, these approaches cannot learn negative preferences from high-order heterogeneous interactions between users and items formed by multiple links with different signs, resulting in inaccurate and incomplete negative user preferences. To address these issues, we propose a novel Light Signed Graph Convolution Network specifically for Recommendation (LSGRec), which adopts a unified modeling approach to simultaneously model users' high-order positive and negative preferences on a signed user-item interaction graph. Specifically, for the negative preferences within high-order heterogeneous interactions, first-order negative preferences are captured by the negative links, while high-order negative preferences are propagated along positive edges. Then, recommendation results are generated based on positive preferences and optimized with negative ones. Finally, we train representations of users and items through different auxiliary tasks. Extensive experiments on three real-world datasets demonstrate that our method outperforms existing baselines in both recommendation performance and computational efficiency. Our code is available at https://anonymous.4open.science/r/LSGRec-BB95.
Exploiting Negative Preference in Content-based Music Recommendation with Contrastive Learning
Advanced music recommendation systems are being introduced along with the development of machine learning. However, it is essential to design a music recommendation system that increases user satisfaction by understanding users' music tastes rather than by increasing model complexity. Although several studies on music recommendation systems that exploit negative preferences have shown performance improvements, they did not explain how negative preferences led to better recommendations. In this work, we analyze the role of negative preference in users' music tastes by comparing music recommendation models based on contrastive learning exploiting preference (CLEP) under three different training strategies: exploiting both positive and negative preferences (CLEP-PN), positive preferences only (CLEP-P), and negative preferences only (CLEP-N). We evaluate the effectiveness of negative preference by validating each system with a small amount of personalized data obtained via a survey, and we further illuminate the possibility of exploiting negative preference in music recommendation. Our experimental results show that CLEP-N outperforms the other two in accuracy and false positive rate. Furthermore, the proposed training strategies produced a consistent tendency regardless of the type of front-end musical feature extractor, indicating the stability of the proposed method.
gOCCF: Graph-Theoretic One-Class Collaborative Filtering Based on Uninteresting Items
Lee, Yeon-Chang (Hanyang University) | Kim, Sang-Wook (Hanyang University) | Lee, Dongwon (The Pennsylvania State University)
We investigate how to address the shortcomings of popular One-Class Collaborative Filtering (OCCF) methods in handling challenging “sparse” datasets in the one-class setting (e.g., clicked or bookmarked), and propose a novel graph-theoretic OCCF approach, named gOCCF, that exploits both positive preferences (derived from rated items) and negative preferences (derived from unrated items). After capturing both positive and negative preferences as a bipartite graph, we apply graph shattering theory to determine the right amount of negative preferences to use. Then, we develop a suite of novel graph-based OCCF methods based on random walk with restart and belief propagation. Through extensive experiments using 3 real-life datasets, we show that gOCCF effectively addresses the sparsity challenge and significantly outperforms all 8 competing methods in accuracy on very sparse datasets, while providing accuracy comparable to the best-performing OCCF methods on less sparse datasets. The datasets and implementations used in the empirical validation are available for access: https://goo.gl/sfiawn.
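As a rough illustration of one building block named in this abstract, the sketch below runs a plain random walk with restart on a tiny user-item bipartite graph built from positive interactions only. It does not reproduce gOCCF: negative-preference edges, graph shattering, and belief propagation are all omitted. The graph, node names, and restart probability are made-up assumptions.

```python
def random_walk_with_restart(adj, seed, restart=0.15, iters=100):
    """Approximate RWR scores by power iteration (illustrative sketch).

    adj maps each node to a list of its neighbours (undirected graph).
    seed is the node the walker restarts from, e.g. the target user.
    """
    score = {node: (1.0 if node == seed else 0.0) for node in adj}
    for _ in range(iters):
        nxt = {node: 0.0 for node in adj}
        for node, nbrs in adj.items():
            moving = (1.0 - restart) * score[node]
            if nbrs:
                # Spread the non-restarting mass evenly over neighbours.
                for nbr in nbrs:
                    nxt[nbr] += moving / len(nbrs)
            else:
                # A node with no edges sends its mass back to the seed.
                nxt[seed] += moving
        # The restarting fraction of all mass jumps back to the seed.
        nxt[seed] += restart * sum(score.values())
        score = nxt
    return score


# Hypothetical interactions: u1 rated i1, i2; u2 rated i2, i3; u3 rated i4.
graph = {
    "u1": ["i1", "i2"], "u2": ["i2", "i3"], "u3": ["i4"],
    "i1": ["u1"], "i2": ["u1", "u2"], "i3": ["u2"], "i4": ["u3"],
}
scores = random_walk_with_restart(graph, seed="u1")
# Rank the items u1 has not rated by their proximity to u1.
unrated = sorted(["i3", "i4"], key=lambda item: scores[item], reverse=True)
```

Here `i3` is reachable from `u1` through the shared item `i2`, while `i4` sits in a disconnected component, so `i3` ranks first among the unrated items.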
Preferences in Constraint Satisfaction and Optimization
Rossi, Francesca (University of Padova) | Venable, Kristen Brent | Walsh, Toby
In this case, all PCs will be considered, but some will be more preferred than others. Such concepts can be expressed in either a qualitative or a quantitative way. Preferences and constraints are closely related notions, since preferences can be seen as a form of "tolerant" constraints. For this reason, there are several constraint-based frameworks to model preferences. One of the most general frameworks, based on soft constraints (Meseguer, Rossi, and Schiex 2006), extends the classical constraint formalism to model preferences in a quantitative way, by expressing several degrees of satisfaction that can be either totally or partially ordered. When there are both levels of satisfaction and levels of rejection, preferences are bipolar and can be modeled by extending the soft constraint formalism (Bistarelli et al. 2006). Preferences can also be modeled in a qualitative way (also called ordinal), that is, by pairwise comparisons. In this case, soft constraints (or their extensions) are not suitable.
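To make the quantitative reading of soft constraints concrete, the sketch below treats each constraint as a cost function and searches for the assignment with the lowest total cost. This is a brute-force weighted-constraint example under assumed variables and costs, not the general semiring-based framework cited above. An infinite cost plays the role of a hard constraint.

```python
import math
from itertools import product


def solve_weighted_csp(domains, constraints):
    """Return (best_assignment, best_cost) by exhaustive search.

    domains maps variable names to lists of candidate values.
    constraints is a list of functions from assignment dict to a cost,
    where 0 means fully satisfied and math.inf means forbidden.
    """
    names = list(domains)
    best, best_cost = None, math.inf
    for values in product(*(domains[name] for name in names)):
        assignment = dict(zip(names, values))
        cost = sum(constraint(assignment) for constraint in constraints)
        if cost < best_cost:
            best, best_cost = assignment, cost
    return best, best_cost


# Hypothetical problem with two variables and three constraints.
domains = {"x": [0, 1, 2], "y": [0, 1, 2]}
constraints = [
    # Hard constraint: x and y must differ.
    lambda a: math.inf if a["x"] == a["y"] else 0,
    # Soft preference: smaller x is better, with graded cost.
    lambda a: a["x"],
    # Soft preference: y should be 2, at a penalty of 1 otherwise.
    lambda a: 0 if a["y"] == 2 else 1,
]
best, best_cost = solve_weighted_csp(domains, constraints)
```

Qualitative (ordinal) preferences, which the passage says are not well served by soft constraints, would need a pairwise comparison relation instead of a summed cost.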